Interfacing of CASA and Multistream recognition

Authors

  • Hervé Glotin
  • Frédéric Berthommier
  • Emmanuel Tessier
  • Hervé Bourlard
Abstract

In this paper we propose a running demonstration of the coupling between an intermediate processing step (named CASA), based on the harmonicity cue, and partial recognition, implemented with an HMM/ANN multistream technique [2]. The model is able to recognise words corrupted with narrow-band noise, either stationary or with a variable center frequency. The principle is to identify, frame by frame, the noisiest of four subbands by analysing an SNR-dependent representation. A static partial recogniser is then fed with the remaining subbands. We establish on NUMBERS93 the noisy-band identification (NBI) performance as well as the word error rate (WER), and alter the correlation between these two indexes by changing the distribution of the noise.
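
The frame-by-frame principle described above can be sketched roughly as follows. This is only an illustrative Python sketch, not the authors' implementation: it assumes that one SNR estimate per subband is already available for each frame (the abstract does not say how the SNR-dependent representation is computed), and the function names and toy values are hypothetical.

```python
from typing import List, Sequence

NUM_SUBBANDS = 4  # the abstract states that four subbands are used


def noisiest_subband(snr_per_band: Sequence[float]) -> int:
    """Return the index of the subband with the lowest SNR estimate in one frame.

    Assumption: a lower value means a noisier subband.
    """
    if len(snr_per_band) != NUM_SUBBANDS:
        raise ValueError("expected one SNR estimate per subband")
    return min(range(NUM_SUBBANDS), key=lambda b: snr_per_band[b])


def remaining_subbands(noisy_band: int) -> List[int]:
    """Return the subbands that would be passed on to the partial recogniser."""
    return [b for b in range(NUM_SUBBANDS) if b != noisy_band]


def nbi_accuracy(estimated: Sequence[int], reference: Sequence[int]) -> float:
    """Fraction of frames in which the identified noisy band matches the reference.

    Assumption: NBI performance is scored as a simple per-frame hit rate.
    """
    if len(estimated) != len(reference) or not reference:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(1 for e, r in zip(estimated, reference) if e == r)
    return hits / len(reference)


if __name__ == "__main__":
    # Hypothetical per-frame SNR estimates (dB) for three frames, four subbands each.
    frames = [
        [12.0, -3.0, 9.5, 15.0],
        [8.0, 7.5, -6.0, 11.0],
        [10.0, 9.0, 14.0, 2.0],
    ]
    reference = [1, 2, 0]  # hypothetical ground-truth noisy band per frame

    estimated = [noisiest_subband(frame) for frame in frames]
    for noisy in estimated:
        print("noisy band:", noisy, "kept bands:", remaining_subbands(noisy))
    print("NBI accuracy:", nbi_accuracy(estimated, reference))
```

Selecting a single band per frame mirrors the "most noisy subband within four" wording; a system that discards several bands, or that weights bands softly, would need a different selection rule.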

Similar resources

Interfacing of CASA and partial recognition based on a multistream technique

We propose a running demonstration of coupling between an intermediate processing step (named CASA), based on the harmonicity cue, and partial recognition, implemented with a HMM/ANN multistream technique [2]. The model is able to recognise words corrupted with narrow band noise, either stationary or having variable center frequency. The principle is to identify frame by frame the most noisy su...

A New Snr-feature Mapping for Robust Multistream Speech Recognition

We describe a new model of CASA labelling which assigns to each time-frequency region a probability "clean" enough to feed a multistream recogniser only adapted to clean data. This labelling process is based on the harmonicity of the speech. The probability is evaluated according to a SNR-feature mapping and the choice of a SNR decision threshold. This allows an extension of a previous method [...

A Measure of Speech and Pitch Reliability from Voicing

We propose a CASA labelling method of the TF representation, which is based on the periodicity of the speech, related to the voicing. A local voicing index is estimated in four subbands after demodulation of the signal. This index is used as a reliability measure for both pitch identification and speech recognition. First, this model allows robust f0 identification thanks to the voicing index, ...

A CASA-labelling model using the localisation cue for robust cocktail-party speech recognition

We propose a new cocktail-party recognition technique based on the coupling of a CASA-labelling method using the TDOA (Time Delay Of Arrival) with multistream recognition. This is an alternative to the classical "segregate and recognise" architecture. First, we have recorded a stereo database ST-NB95 from the mono Numbers95. This is composed of binary mixtures of sentences at 0dB, placed left a...

Interfacing Sound Stream Segregation to Automatic Speech Recognition - Preliminary Results on Listening to Several Sounds Simultaneously

This paper reports the preliminary results of experiments on listening to several sounds at once. Two issues are addressed: segregating speech streams from a mixture of sounds, and interfacing speech stream segregation with automatic speech recognition (ASR). Speech stream segregation (SSS) is modeled as a process of extracting harmonic fragments, grouping these extracted harmonic fragments, an...

Publication date: 2000